On Determining the Number of Clusters: A Comparative Study

Authors

  • Brian Bies
  • Kathryn Dabbs
  • Hao Zou
Abstract

In this paper, we perform one of the first empirical tests comparing several existing algorithms for determining the number of clusters in a data set: the gap statistic, X-means, G-means, data spectroscopic clustering, and self-tuning spectral clustering. We use a large number of randomly generated data sets with varying distributions and parameters. The results show that G-means and X-means perform best on the majority of test cases. We also explore ways to improve the gap statistic and formulate the problem in a simplified continuous setting to examine its theoretical basis.
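
For readers unfamiliar with the gap statistic named above, the following is a minimal sketch of the idea as it is usually described: compare the log within-cluster dispersion of the data with its average over reference data sets drawn uniformly from the data's bounding box, then pick the smallest k whose gap is within one standard error of the next. This is an illustration under stated assumptions, not the implementation evaluated in the paper. The function names (`seed_centers`, `within_dispersion`, `gap_statistic`), the k-means++ style seeding, the number of reference sets, and the synthetic three-blob data are all hypothetical choices made for the example.

```python
import math
import random


def seed_centers(points, k, rng):
    # k-means++ style seeding (an assumption of this sketch): later centers
    # are drawn with probability proportional to squared distance from the
    # nearest center chosen so far.
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(math.dist(p, c) for c in centers) ** 2 for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


def within_dispersion(points, k, rng, n_init=3, n_iter=50):
    # Best within-cluster sum of squares W_k over a few Lloyd's k-means runs.
    best = math.inf
    for _ in range(n_init):
        centers = seed_centers(points, k, rng)
        for _ in range(n_iter):
            clusters = [[] for _ in range(k)]
            for p in points:
                j = min(range(k), key=lambda c: math.dist(p, centers[c]))
                clusters[j].append(p)
            # Empty clusters keep their previous center.
            updated = [
                tuple(sum(axis) / len(cl) for axis in zip(*cl)) if cl else centers[j]
                for j, cl in enumerate(clusters)
            ]
            if updated == centers:
                break
            centers = updated
        wss = sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)
        best = min(best, wss)
    return best


def gap_statistic(points, k_max, n_refs=10, seed=0):
    # Returns (estimated k, list of gap values for k = 1..k_max).
    rng = random.Random(seed)
    axes = list(zip(*points))
    lows = [min(a) for a in axes]
    highs = [max(a) for a in axes]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = math.log(within_dispersion(points, k, rng))
        ref_logs = []
        for _ in range(n_refs):
            # Reference set: uniform over the bounding box of the data.
            ref = [
                tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                for _ in points
            ]
            ref_logs.append(math.log(within_dispersion(ref, k, rng)))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w)
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    # Smallest k with Gap(k) >= Gap(k+1) - s_{k+1}; gaps[0] holds Gap(1).
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps
    return k_max, gaps


# Hypothetical demo data: three well-separated Gaussian blobs in the plane.
demo_rng = random.Random(1)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [
    (demo_rng.gauss(cx, 0.5), demo_rng.gauss(cy, 0.5))
    for cx, cy in blob_centers
    for _ in range(30)
]
k_hat, gaps = gap_statistic(points, k_max=4)
print(k_hat, [round(g, 2) for g in gaps])
```

On data like this, the gap should peak sharply at the true number of blobs. The selection rule is more sensitive on overlapping or elongated clusters, which is the kind of case a comparative study such as this one is designed to probe.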




Publication date: 2009